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AMONG DIFFERENT TRANSCODING CHANNELS 

BACKGROUND OF THE INVENTION 

The present invention relates to a statistical 
remultiplexer for transcoding digital video signals. 

Commonly, it is necessary to adjust a bit rate of 
digital video programs that are provided, e.g., to 
subscriber terminals in a cable television network or 
the like. For example, a first group of signals may be 
received at a headend via a satellite transmission. 
The headend operator may desire to forward selected 
programs to the subscribers while adding programs 
(e.g., commercials or other content) from a local 
source, such as storage media or a local live feed. 
Additionally, it is often necessary to provide the 
programs within an overall available channel bandwidth. 

Accordingly, the statistical remultiplexer (stat 
remux) , or multi-channel transcoder, which handles pre- 
compressed video bit streams by re-compressing them at 
a specified bit rate, has been developed. 

In such systems, a number of channels of data are 
processed by processors arranged in parallel. Each 
processor typically can accommodate multiple channels 
of data. Although, in some cases, such as for HDTV, 
which require many computations, portions of data from 
a single channel may be allocated among multiple 
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processors. Moreover, typically a fixed transcoding 
bandwidth is allocated to one or more groups of 
channels (stat remux groups) . 

However, there is a need for an improved stat 
5 remux system that provides a bit rate need parameter 

for each channel to enable bits to be allocated for 
transcoding the channels in a manner that optimizes the 
image quality of the coded data, while still meeting 
the constraints of a limited throughput. 

10 The system should estimate the bit rate need 

parameter from statistical information that is derived 
from the bitstream, such as a frame bit count and 
average quantizer scale values of the original 
bitstream. The system should be compatible with MPEG-2 

15 bitstreams. The system should allocate a target output 

frame bit count for I, P and B frames based on the 
coding complexity estimated from the statistical 
information of the original bit stream. 

Moreover, the system should accommodate MPEG-2 

20 macroblock processing within a frame, by using a 

macroblock bit count and quantizer scale values of the 
original bit stream to guide the rate control process 
to meet the target frame bit count at the output. 

The system should provide periodic adjustments of 

25 an allocated transcoding bit rate a number of times in 

a video frame. 

Additionally, the system should derive quantizer 
scale values for transcoding macroblocks in a frame 
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based on original, pre-transcoding quantizer scale 
values. The quantizer scale values should be adjusted 
as transcoding of a frame proceeds to ensure that each 
macroblock is allocated a minimum number of bits for 
5 transcoding. 

The present invention provides a system having the 
above and other advantages. 
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SUMMARY OF THE INVENTION 

The present invention relates to a statistical 
remultiplexer for transcoding digital video signals. 
In one aspect of the invention, a bit rate need 
5 parameter is estimated for statistical re-multiplexing 

from a frame bit count and average macroblock quantizer 
scale values (averaged over a frame) of an original 
bitstream, such as an MPEG-2 bitstream. A lookahead 
of, e.g., five-frames is provided. 
10 The invention allocates a total available 

bandwidth among the transcoding channels. 

The invention allocates a target of output frame 
bit count for I, P and B frames based on the coding 
complexity estimated from the frame bit count and the 
15 average macroblock quantizer scale (averaged over each 

input frame) of the original bit stream. 

Furthermore, in another aspect of the invention, 
during MPEG-2 macroblock processing within a frame, a 
macroblock bit count and quantizer scale value of the 
20 original bit stream are used to guide the rate control 

process to meet the target frame bit count at the 
output. 

Thus, the present invention provides an efficient 
statistical remultiplexer for processing data in a 
25 number of channels that include video data. In one 

aspect of the invention, transcoding of the video data 
is delayed while statistical information is obtained 
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from the data. Bit rate need parameters for the data 
are determined based on the statistical information, 
and the video data is transcoded based on the 
respective bit rate need parameters following the 
5 delay. 

In another aspect of the invention, a transcoding 
bit rate for video frames at the stat remux is updated 
a plurality of times at successive intervals to allow a 
closer monitoring of the bit rate. Moreover, minimum 

10 and maximum bounds for the transcoding bit rate are 

updated in each interval. Thus, a portion of a frame 
is transcoded in a first interval, then the transcoding 
bit rate is updated, then a second portion of the frame 
is transcoded in a second interval, then the 

15 transcoding bit rate is updated again, and so forth. 

In yet another aspect of the invention, the pre- 
transcoding quantization scales of the macroblocks in a 
frame are scaled to provide corresponding new 
quantization scales for transcoding based on a ratio of 

20 a pre-transcoding amount of data in the frame and a 

target, post-transcoding amount of data for the frame. 
Moreover, the quantization scales are adjusted for 
different portions of the frame as the portions are 
transcoded to ensure that a minimum amount of 

25 transcoding bandwidth is allocated to each macroblock. 

Corresponding methods and apparatuses are 
presented. 
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BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 illustrates a stat remux, and a data flow 
into and out of a quantization level processor (QLP) , 
in accordance with the present invention. 
5 FIG. 2 illustrates a simplified transcoder for use 

in accordance with the present invention. 

FIG. 3 illustrates a transcoder that performs 
requantization without motion compensation for use in 
accordance with the present invention. 
10 FIG. 4 illustrates an end-to-end stat remux 

processing delay in accordance with the present 
invention . 

FIG. 5 illustrates a transcoder video buffering 
verifier (VBV) model in accordance with the present 
15 invention. 

FIG. 6 illustrates transcoder rate timing in 
accordance with the present invention. 

FIG. 7 illustrates communication timing between a 
QLP and transcoder processing elements (TPEs) in 
20 accordance with the present invention. 
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DETAILED DESCRIPTION OF THE INVENTION 

The present invention relates to a statistical 
remultiplexer for transcoding digital video signals. 

The following acronyms and terms are used: 
5 BW - Bandwidth 

DCT - Discrete Cosine Transform 

DTS - Decoding Time Stamp 

ES - Elementary Stream 

FIFO - First-In, First-Out 
10 KP - Kernel Processor 

MTS - MPEG Transport Stream 

PCI - Peripheral Component Interconnect 

PCR - Program Clock Reference 

PES - Packetized Elementary Stream 
15 PID - Program Identifier 

Q - Quantization 

QLP - Quantization Level Processor 
SDRAM - Static Dynamic Random Access Memory 
TP - Transport Packet 
20 TPE - Transcoder core Processing Element 

VLD - Variable-Length Decoding 
VLE - Variable-Length Encoding 

FIG. 1 illustrates a stat remux, and a data flow 
into and out of a QLP, in accordance with the present 
25 invention. 

The stat remux 100 includes a groomer 105 for 
receiving a number of input transport streams. 
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Corresponding input transport packets in different 
video services are provided to one of a number of 
transcoders 110, 112, or TPEs (transcoding engines). 
Typically, each transcoder can handle one or more video 
5 services (channels) . Transcoded data is provided via a 

PCI bus 115 to a multiplexer (mux) 120, which assembles 
a corresponding output transport stream. 

A Kernel Processor (KP) configures the groomer 
105, the TPEs 110, 112, the QLP 130, and the Mux 
10 120. 

In particular, the mux 120 is responsive to a 
transmission bit rate provided from a QLP 130, which 
has a memory 132 such as a SDRAM. The QLP may be 
implemented using a media processor, such as the 
15 MAPCA2000 (300MHz) media processor from Equator 

Technologies, Inc. The QLP 130 performs the following 
functions : 

• Allocates an available bandwidth to the output 
video services to optimize the video quality and 

20 determine the target frame size for each frame to 

be transcoded. 

• Receives configuration parameters from the Kernel 
Processor. 

• Reports operational status and statistics back to 
25 the Kernel Processor. 

The QLP communicates with the KP and TPEs via the 
PCI bus (32bit @ 66MHz) . A block of memory is 
allocated on an SDRAM of the QLP for interprocessor 
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communication. This memory block is "shared" by the 
QLP with other processors. 
Inputs to QLP 130 

• Configuration parameters and commands (Source: KP 
140) 

•. Video and associated audio and data input packet 
rate information (Source: TPEs 110, 112) 

• Statistics of the input frame to be transcoded 
(Source: TPEs) 

• Timing information of the input frame (Source: 
TPEs) 

• Statistics of the output frame just transcoded 
(Source: TPEs) 

• Timing information of the output frame (Source: 
TPEs) 

• Transcoder FIFO level (Source TPEs) 

• Non-video (data) input packet rate information 
(Source: Mux 120) 

Outputs from QLP 130 

• Status and statistics (Destination: KP) 

• TPE service assignment information (Destination: 
KP) 

• Transmission bit rate (Destination: Mux) 

• Transcoding target frame size (Destination: TPE) 

• Maximum and Minimum frame size for buffer 
protection (Destination: TPE) 
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• Minimum number of PCRs to be inserted into a frame 
( Destination : TPE ) 

• Flag to command the TPE to passthrough a frame 
(Destination: TPE) 

1 . Overview 

Although the transcoder 100 is not necessarily 
decoding and re-encoding the video stream, the 
transcoding function can emulate a full decode and re- 
encode. The rate control and stat-remux system in 
accordance with the invention is summarized in the 
following steps. Details are described in the next 
sections . 

1. Each TPE 110, 112 inputs the transport 
stream of every video channel it is processing. The 
transport stream is then unpacketized and a video 
decoder buffer is emulated. A lookahead buffer is used 
by each TPE to store a number of future frames and 
obtain statistical information from these frames. In 
particular, for every input frame to be transcoded, the 
TPE computes the average quantizer scales values and 
the number of bits in the input frame. These 
parameters are used by the QLP 130 to calculate the bit 
rate need parameter for the input frame at a scheduled 
time that is at least 1.5 NTSC frame times before it is 
that frame's turn to be transcoded. In particular, 
because of the possible use of the 3:2 pulldown format 
in the video channels, a coded frame can be either one 
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NTSC frame time (33.3ms) or 1.5 frame times (50ms). We 
use the longer time to make sure the transcoding rate 
allocation for the frame is determined before the 
actual transcoding begins. 
5 2. The QLP 130 performs a bandwidth allocation 

process to allocate a transcoding bit rate to the TPEs 
at periodic intervals,. T q . It computes the transcoding 
bit rate for every video channel for each Tq interval. 
The transcoding bit rate is stored in a queue at the 

10 QLP 130 and delayed for (0.5 sec + 3 NTSC frame 

periods) =0.6 sec, rounded to the nearest Tq period. 
One NTSC frame period is 1/30 sec. The delayed value 
of the transcoding bit rate of the video channel 
becomes the transmission bit rate, and is used by the 

15 Mux 120. 

3. While a frame is in the lookahead buffer of a 
TPE, the average transcoding bit rate over the frame is 
used to derive an initial value of a target frame size, 
which is a predicted size of the frame after 

20 transcoding. This initial target frame size value is 

stored in an output frame size queue (e.g., in memory 
132) of the QLP 130, and is retrieved when the 
associated TPE is ready to transcode the frame. Queues 
may be implemented by the QLP in the memory 132. 

25 4. When the transcoder is ready to transcode a 

new frame, the initial target frame size value that was 
previously determined is retrieved from the output 
frame size queue. Based on the current state 
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(fullness) of the transcoder buffer, the maximum and 
minimum frame size to protect the decoder buffer from 
underflow or overflow are calculated for bounding the 
initial target frame size. 
5 5. If the number of target bits (target frame 

size) is greater than (or close to, within a 
predetermined tolerance - see section 6.3) the number 
of bits in the input frame, this frame bypasses 
transcoding in a passthrough mode since the purpose of 

10 transcoding is to reduce the number of bits in a frame. 

This situation may occur when the input bitstream is 
already heavily compressed. When a frame is bypassed, 
the associated input elementary stream is re- 
packetized. If the number of target bits is smaller 

15 than the number of bits in the input frame, the frame 

is bit-reduced (transcoded) . Bit reduction may be 
performed, e.g., either through a simplified transcoder 
architecture (FIG. 2) or re-quantization (FIG. 3) . 

6. The quantizer scale for each macroblock (or 

20 group of macroblocks) for transcoding is chosen based 

on the number of target bits per frame, and the 
original quantizer scale. The condition that the 
output quantization scale is higher (coarser) than the 
input quantization scale must be met. 

25 7. During transcoding, the TPEs have to allocate 

certain slots for a PCR field in the packets they 
output. It is important to avoid allocating more slots 
than necessary since this wastes bits. Thus, the 
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outgoing packets at a TPE are created and stored in 
memory, e.g., in a TPE FIFO buffer. Moreover, the QLP 
130 uses the target frame size to estimate the time 
used for transmitting the frame, and hence the time for 
5 inserting the PCRs. 

Moreover, a PCR slot is created at least every 0.1 
second to conform to the MPEG2 system standard 
requirement . 

8. At each n*Tq period, the Mux 120 reads the 
10 number of packets assigned to each channel from the 

TPEs 110, 112 via the PCI bus 115. M n" is a design 
parameter for the Mux 120 and can be any positive 
integer. This packet assignment is equivalent to the 
transmission bit rate allocation, which is in turn a 
15 delayed version of the transcoding bit rate allocation. 

That is, the bit rate is converted to a number of 
packets to send to the Mux per Tq period. 

9. The Mux 120 receives a transport tick every m 
ticks of a 27MHz clock. If the packet to transmit 

20 contains a PCR, the Mux performs PCR correction to 

provide a PCR value that is properly synchronized with 
a master clock of the transcoder. This may achieved as 
described in commonly-assigned, co-pending U.S. Patent 
Application No. to R. Nemiroff, V. Liu and S. 

25 Wu, filed on , and entitled "Regeneration Of 

Program Clock Reference Data For Mpeg Transport 
Streams." The transport packet (s) are sent out the Mux 
Processor via the PCI bus 115. The transport tick 
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refers to the timing interval for outputting a 
transport packet. 

FIG. 2 illustrates a simplified transcoder for use 
in accordance with the present invention. 
5 While a straightforward transcoder can simply be a 

cascaded MPEG decoder and encoder, the transcoder 200 
provides a simplified design that reduces computations. 
The transcoder architecture 200 performs most 
operations in the DCT domain, so both the number of 

10 inverse-DCT and motion compensation operations are 

reduced. Moreover, since the motion vectors are not 
recalculated, the required computations are 
dramatically reduced. This simplified architecture 
offers a good combination of both low computation 

15 complexity and high flexibility. 

In particular, a pre-compressed video bitstream is 
input to a Variable Length Decoder (VLD) 215. A 
dequantizer function (inverse quantizer) 220 processes 
the output of the VLD 215 using a first quantization 

20 step size, Qi. 

Motion vector (MV) data is provided from the VLD 
215 to a motion compensation function 235, which is 
responsive to a previous frame buffer 250 and/or a 
current frame buffer 245 of pixel domain data. A DCT 

25 function 270 converts the output of the MC function 235 

to the frequency domain and provides the result to an 
adder 230. A switch 231 passes either the output of 
the adder 230 or the Qi" 1 function 220 to a quantization 
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function Q 2 275, which quantizes the data, typically at 
a coarser level to reduce the bit rate. This output 
is then inverse quantized at an inverse quantization 
function Q 2 _1 282 for summing at an adder 286 with the 
output of the switch 231. The output of the adder 286 
is provided to an IDCT function 284, and the output 
thereof is provided to the frame buffers 245 and 250. 

A Variable Length Encoder (VLE) 280 codes the 
output of the Quantization function 275 to provide at 
output bitstream at a reduced bit rate. The bit output 
rate of the transcoder is thus adjusted by changing Q 2 . 

FIG. 3 illustrates a transcoder 300 that performs 
requantization without motion compensation for use in 
accordance with the present invention. 

Here, only re-quantization is applied to a frame, 
without motion compensation. Generally, IDCT and DCT 
operations are avoided. This strategy incurs a lower 
complexity, but causes some artifacts in the output 
data. The DCT coefficients are de-quantized then re- 
quantized . 

In particular, a VLD 410, inverse quantizer 420, 
quantizer 430 and VLE 440 are used. 

2. End-to-end Processing Delay 

FIG. 4 illustrates an end-to-end stat remux 
processing delay in accordance with the present 
invention. 
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An example one of the transcoders or TPEs 110 
includes an MTS buffer 405 for buffering the input 
transport stream, a demux 410 for separating out the 
elementary streams of the different services in the 
5 transport stream, and an ES buffer 415 for storing the 

ESs streams. The ES data is variable-length decoded at 
a VLD function 420, and the result is provided to a 
lookahead delay buffer 425, with a capacity of, e.g., 
five frames. After a one-frame delay at a buffer 435, 

10 a frame is transcoded at a transcode function 440, and 

the result is stored in a transcode buffer 445. A 
remultiplexer (remux) 450 combines data from the 
transcode buffer 445 and data, if present, from a 
transport stream delay buffer 430, and the resulting 

15 transport stream is communicated to a decoder 452, such 

as a set-top box in a broadband communication network. 
The transport stream delay buffer 430 is used for the 
bypass frames, discussed previously, that are not 
transcoded. The bypass frames are delayed to maintain 

20 synchronicity with the other channels that are 

transcoded . 

Note that, in practice, the output stream from the 
transcoder 110 is combined with other transport steams, 
from the other transcoders to form a transport 
25 multiplex that is communicated to a representative 

decoder 452. The decoder 452 includes a FIFO buffer 
455 that buffers the incoming data, and a decoding 
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function 4 60 that decodes the data to provide an 
output, e.g., for display on a television. 

A buffer delay of, e.g., 0.5 seconds, which can 
vary in different implementations, is experienced by 
5 the video packets. This is a delay between the 

transcoding (encoding) time and the decode time. This 
delay occurs in both the transcoder (output) video FIFO 
445 and the decoder FIFO 455. The buffer delay is 
fixed. If the transcoding time is delayed, the actual 
10 transcode time to decode time is shortened, but the 

transcode 'tick 1 to decode time is fixed by the buffer 
delay. 

3. Buffer Model 

FIG. 5 illustrates a transcoder VBV model in 
15 accordance with the present invention. 

The vbv model of the decoder 452 is used to limit 
the maximum and minimum frame size before transcoding a 
new frame. The level of the transcoder' s bitstream 
output FIFO can be used to derive the decode buffer 
20 status just before the DTS of the new frame (see also 

FIG. 7) . Specifically, the future decode buffer status 
( vbv_fullness) is given by 

vbv_fullness = (No. of bits to be transmitted 
from current time to the DTS time of the new frame) - 
25 (No. of bits in the encoder FIFO ) . The vbv__fullness 

calculation is shown in FIG. 5, where the composition 
of the transcoder FIFO is shown at 500, and the 
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corresponding composition of the decoder FIFO is shown 
at 550. 

Moreover, we can compute the number of bits 
transmitted by adding all the encoding rates starting 
5 from t seconds up to the last encoding rate issued by 

the QLP, (where t is the total delay through the encode 
plus decode buffers, i.e., the system delay.) The QLP 
provides the bit rate in a number of packets to output 
for the time period T q . 

10 Moreover, a margin needs to be added due to the 

uncertainty caused by the variable latency from the 
time the QLP issues a rate change to the time the new 
rate actually takes effect at the transcoder. The new 
bit rate is changed at the fixed period T q . T q is 

15 asynchronous to the video frame time (DTS of decoder) . 

As shown in FIG. 6, the transcoding rates are 
computed at the dashed lines, e.g., 602, 604, 606, ... 
This example assumes the system delay is three frames 
and the transmission rate needs to be computed for Pl-1 

20 to Pl-3. With this notation, Pl-1 denotes 

Program (bitstream #1) frame #1, Pl-2 denotes 
Program (bitstream #1) frame #2 , and so forth. Since 
the frame DTS times do not align with the rate changes., 
this causes a difference between the transcoding rate 

25 and the transmission rate. Moreover, since the T q 

period straddles two frames, the second (later) frame 
is assigned those packets. 
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The worst case rate error between transcoding and 
transmission is the difference in the number of packets 
allocated at the current time and the number of packets 
assigned a system delay time later (DTS of the current 
5 frame) . The bottom of FIG. 6 shows the two extreme 

cases. In the first case (650), the frame DTS occurs 
just prior to T q . In the second case 670, the frame DTS 
occurs at T q + . A through AA represent the number of 
packets assigned to each T q period. Both cases have the 
10 same encoding packet assignment, sum (B through X) . 

The number of transmission packets for case 1 is 
Sum(C,D,E, ... ,W,X,Y); and, for case 2, Sum( B,C,D, ... 
,V,W,X). Case 2 has no difference between encoding and 
transmission rates, so this is the best case (DTS 
15 aligned with T q ) . Case 1, which is the worst case, has 

a difference of B packets. 

Therefore, the estimated number of bits to be 
transmitted from the current time to the DTS time is: 
£ transcoding_packets (n) ± packet_count_error, 
20 packet_count_error = (number of packets assigned 

at DTS time) - (number of packets assigned at current 
time) . A positive and negative packet_count_error has 
different effects on frame size calculations. 

Note that system delay should be a multiple of T q . 
25 With the estimated vbv fullness given by 

vbv_fullness = (no. of bits to be transmitted) - 
(bits in transcoder FIFO) ; 

this value can be used to limit the trascoded 
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frame size, so it will be no more than vbv_f ullness. 
This requirement is imposed to ensure the decoder 
buffer will not underflow while decoding the current 
frame (i.e., the frame that is about to be transcoded) . 
5 The maximum frame size and minimum frame size can 

be derived from the sequence of transmission bit rate, 
snapshot of the transcoder buffer level, and the sizes 
of the previously transcoded frame, as follows: 
Let B(t) = Buffer level at time t. 
10 tc = time when the current frame enters the 

transcoder FIFO. 

tO = time when the transcoder FIFO level was 

last read. 

T(t) = Size of the frame entering the FIFO at 



15 time t 



R(t) = transmission bit rate at time t 
dts = DTS of the current frame. 
nextDts = DTS of the next frame 
D = decoder buffer size. 

dts 



20 Maximum Frame Size = ^R(t)-B(tc) 



t=(0 t=(0 



/=/0 r=/0 
dts tc 

= 5>(0-*(K»-£r(0 

t=tO t=(0 

nextDts 

Minimum Frame Size = ^R(t)- B(tc)- D 



t-tc 
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nextDts 

= Maximum Frame Size + ^R(t)-D 

t~dts 

FIG. 7 illustrates communication timing between a 
quantization level processor (QLP) and transcoder 
processing elements (TPEs) in accordance with the 
5 present invention. 

At time 705, a TPE sends statistical information 
for a current frame "N" to the QLP. At time 710, the 
TPE sends information regarding the fullness of the 
TPE's output buffer, which includes data from a 

10 previously transcoded frame with an index of N-k, e.g., 

where k=l. The previously coded frame is usually frame 
N-l, i.e., the previous frame. However, sometimes it 
might take more than one frame time to transcode a 
frame so the timing might "slip". In that case, the 

15 distance between the "current frame" and the frame just 

transcoded may be more than 1 frame, e.g., such that 
k=2. At time 715, transcoding starts for frame "N" 
using the need parameter calculated from the associated 
statistical information. The transcode bit rate is 

20 calculated for each Tq period, such as at example time 

720. 

Time 725 denotes the start of transcoding for the 
next frame, with index N-l. 

At an example time 730, the TPE sends information 
25 regarding the fullness of its output buffer, which now 

contains data from frame N, to the QLP. In response, 
the QLP provides a target frame size, and minimum and 
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maximum bounds for the transcoding bit rate, to the TPE 
at time 7 35. 

Times 740 and 745 denote the times of the decode 
time stamps of frames N and N+l, respectively. 
5 At time 750, the QLP delivers a transmission bit 

rate to the mux to inform the mux how many packets of 
data in the TPE's output buffer to output in a 
transport stream. This time 750 follows the transcode 
time 720 by a delay period. 

10 4. Need Parameter 

A bit rate need parameter is determined for each 
frame based on an expected complexity of the frame. An 
transcoding bit rate is allocated to each TPE by the 
QLP 130 based on the need parameters and the available 

15 bandwidth. 

Referring again to FIG. 4, the bits of an input 
frame are first partially decoded by the variable 
length decoder 420, and average quantizer-scales and 
the number of bits in the frame are computed. A number 

20 of frames, e.g., five frames, of partially decoded 

coefficients and headers are stored for each video 
channel in the lookahead buffer 425, which provides a 
corresponding lookahead delay. The size of the 
processor SDRAM memory 132 limits the length of the 

25 lookahead buffer. Each coefficient takes two bytes, 

resulting in 720x480x1.5x2 = 1 Mbyte/frame. 
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At a specific time, T frame start, determined by the 
intended decode time of the frame at the target decoder 
452, the need parameter is computed for the oldest 
frame in the lookahead buffer 425. The decode time is 
5 specified by the DTS of the frame, which is in units of 

27MHz clock ticks. T fra mestart is defined as: 

(Decode time of the frame - buffer delay - 1.5 
NTSC frame time) . 

The need parameter is computed from the average 
10 quantizer scale and the bit count of the input frames, 

as follows: 

NeedParameter = MbResolutionAdjust * AvgQR * ( 
CurrentQR + Alpha * PastQR) / ( Beta * CurrentQR + 
PastQR ) , where 
15 AvgQR = (sum of (avglnQuant * inFrameSize) over 

the most recent 15 P or B frames and the most recent I 
frame in the past) * 900,000 / (DTS of current frame - 
DTS of the 16 th frame in the past). 900,000 is the 
number of 27 MHz units in one frame period (1/30 sec.) 
20 27 MHz is the MPEG clock rate. 

If there is no I frame within the past, e.g., 45 
frames, the past 16 P or B frames are used. 

For an I frame, 

CurrentQR = avglnQuant * inFrameSize of the 
25 current frame. 

PastQR = avglnQuant * inFrameSize of the last I 
frame. If there is no I frame within the past 45 




24 



frames, PastQR is set to be the same value of 
CurrentQR. 

For a P or B frame, 

CurrentQR = average of (avglnQuant * inFrameSize) 
5 over the current frame and every frame in the lookahead 

buffer 425 of the same picture type. 

PastQR = average of (avglnQuant * inFrameSize) 
over past four frames of the same picture type. If 
there are less than 4 frames of the same picture type 
10 in the past, PastQR is set to the same value as 

CurrentQR. 

Alpha and Beta are adjustable parameters to 
control the reaction of the need parameter to the 
change in the product of quantizer scale and bit count. 

15 Default values are Alpha = 256, Beta = 256. 

MbResolutionAdjust is an adjustable parameter to 
compensate the perceptual difference in distortion in 
different resolution. The lower the resolution, the 
more visible the distortion. Therefore the need 

20 parameter is boosted for lower resolutions. Default 

values of MbResolutionAdjust are 1.0 for full 
resolution, 1.2 for three-quarter resolution, and 1.5 
for half resolution. Alternatively, or in addition, 
the need parameter may be adjusted based on a 

25 macroblock resolution, which is the number of 

macroblocks in a frame. 



5 . Input Bit Rate Information 

In every Tq time slot, the TPEs 110, 112 and 
Mux 120 count the number of input transport packets and 
save this packet count information in circular buffers 
on the QLP 130. There is one circular buffer of input 
bit rate information for each video program/channel 
processed by the TPEs, and one circular buffer of bit 
rate information for all the data stream that is passed 
directly to the Mux 120 without going through a TPE 
(i.e., in the passthrough mode). Each circular buffer 
has, e.g., 1024 entries, and each entry stores the bit 
rate information of one Tq time slot. The 1024 entries 
is just a design parameter that can vary for different 
implementations. The circular buffer should be large 
enough to hold the data for the 0.6 sec delay. From 
the packet counts, the QLP 130 can calculate the 
instantaneous input bit rate for each Tq time slot as 
follows : 

BitRate (bits per second) = PktCount * 188 * 8 / 
TqPeriod. 

The Mux 120 counts the number of transport packets 
(except null packets) in each data service, which may 
comprise one or more MPEG programs. The QLP 130 uses 
the packet count to compute the instantaneous data 
service input bit rate for each Tq time slot. The Mux 
saves the packet count information in circular buffers 
on the QLP in the same way as the packet count 
information from the TPEs is saved. 



26 



The processes in which the Mux 120 and the TPEs 
write packet count information into the QLP' s circular 
buffers are asynchronous with the Tq ticks. A Tq index 
which is saved with the packet count information is 
5 used to synchronize the QLP with the input packet count 

information during the initialization process. 

The Tq index is maintained by the QLP. The QLP 
sets the Tq index to 0 at initialization, and increases 
it by 1 on every Tq interrupt. The QLP periodically 

10 broadcast the Tq index and the associated time to the 

TPEs 110, 112 and the Mux 120. 

During the transcoding bit rate allocation 
process, the QLP 130 sets aside the bandwidth for the 
pure passthrough video channel (s) and the non-video 

15 channels. Since the transmission bit rate of the 

packets in these passthrough channels has to match the 
bit rate of the corresponding packets at the input, the 
bit rate to set aside for each passthrough video 
channel equals the instantaneous input bit rate at time 

20 PacketCountDelay = PassThroughDelay - 

TcrToTxrDelay 

prior to the current time, where PassThroughDelay 
is the delay of the packets in the video passthrough 
channels (from demux 410 to remux 450), which is fixed 

25 at (0.5 sec. + 6 NTSC frame periods) = 0.7 sec. in the 

example implementation. The non-video PIDs have the 
same amount of delay. 



t • 

27 



TcrToTxrDelay is the delay from the calculation of 
the transcoding rate (current Tq tick) to the 
implementation of the transmission bit rate (FIG. 7) . 
This delay is fixed at (O.Ssec + 1.5 NTSC frame periods 
5 +1.5 NTSC frame periods) =0.6 sec. 

Therefore, PacketCountDelay is a constant equal to 
0.7 sec - 0.6 sec = 0.1 sec. The number of Tq ticks 
equivalent to this delay is: PacketCountDelaylndex = 
(PacketCountDelay / TqPeriod) . 

10 The QLP 130 synchronizes the input packet count 

information with the current Tq interrupt as follows. 

For each circular buffer, the QLP maintains a 10 
bit read pointer. Initially, the QLP searches for the 
entry in the circular buffer whose tqlndex matches the 

15 value of (CurrentTqlndex - PacketCountDelaylndex) . For 

every Tq tick after that, the QLP increases the value 
of the read pointer by one. The QLP also checks the 
continuity of the Tqlndex stored with the packet count 
in the circular buffer. If there is a discontinuity, 

20 the QLP sets a warning flag to the KP 140, and re- 

initializes the read pointer by searching for the 
Tqlndex that matches (CurrentTqlndex - 
PacketCountDelaylndex) . 

For every input frame, the QLP calculates the 

25 average input bit rate over a frame. This computation 

is performed at the same time as the frame's need 
parameter calculation. The average input bit rate is 
used for the calculation of the target frame size. 
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First, the QLP computes the number of integer Tq 
periods straddled by the frame: 

FrameTqCount = (difference between the decode time 
of the next frame and decode time of current frame) / 
TqPeriod, rounding to the next higher integer. 

The QLP computes the duration of the frame from 
FrameTqCount : 

FrameDuration = FrameTqCount * TqPeriod. 

Then, the QLP computes the average input bit rate: 

AvglnBitrate = InPacketCount * 188 * 8 * / 
FrameDuration, 

where InPacketCount is the sum of PacketCount over 
FrameTqCount entries of the video packet count circular 
buffer, starting from the current read pointer. 

6. Bandwidth Allocation 

At every Tq slot, the QLP performs the bandwidth 
allocation procedure. The QLP first assigns the 
bandwidth to the pure passthrough video programs, and 
to the data and audio programs, which are not 
transcoded. The remaining bandwidth is then allocated 
to the remaining channels based on the values of their 
need parameters, and subject to the maximum and minimum 
bit rate constraints. 

6.1. Passthrough video and data channels 

The QLP 130 assigns the transcoding bit rate to 
the pure passthrough channels as follows, 
if ( purePassThrough ) 



t • 

29 



TcodeBitrate = VideoInBitrate 
where VideoInBitrate is the instantaneous input video 
bit rate computed as: 

VideoInBitrate = (PacketCount value stored in the 
5 corresponding video program circular buffer entry at 

the current read pointer ) * 188 * 8 / TqPeriod. 

For each statmux group, the QLP calculates the 
amount of bandwidth that is available for dynamic 
allocation, that is, the amount of bandwidth available 
10 after deducting the bandwidth of the pure passthrough 

channels and the PES alignment overhead bits. A stat 
remux group refers to a group of channels at the 
transcoder 100 that are competing for bandwidth with 
one another. One or more stat remux groups may be used 
15 at the transcoder 100. 

AvailableVideoBitrate = TotalOutputBandwidth - 
(sum of NonVideoInBitrate over all channels) - (sum of 
TcodeBitrate over all pure passthrough channels) - 
(Number of channels that are not pure passthrough * 
20 PesOverheadBitrate) . 

In this equation, TotalOutputBandwidth is the 
total output transport (payload) bandwith available for 
video, audio, and data services in the input streams, 
including system information. This is a user- 
25 configured parameter for the statmux group. 

PesOverheadBitrate is the average overhead bit 
rate for PES alignment, which is a constant: 

PesOverheadBitrate = H * 184 * 8 * 30 = 22.08Kbps. 
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The instantaneous non-video bit rate 
(NonVideoInBitrate) is compute in a similar way as the 
VideoInBitrate : 

NonVideoInBitrate = (PacketCount value stored in 
the corresponding non-video PID' s circular buffer entry 
at the current read pointer ) * 188 * 8 / TqPeriod. 
6.2. Transcoding bit rate allocation 

For each statmux group, the QLP allocates the 
AvailableVideoBitrate among the non-passthrough video 
channels subject to the following constraints: ' 

1. The sum of transcoding bit rates = GroupBandwidth . 
Since the bandwidth available for dynamic allocation 
is variable, and subject to the bandwidth occupied by 
the passthrough components (e.g., non-video data) in 
the transport stream, the group bandwidth is 
expressed as a percentage of the total available 
bandwidth when there is more than one statmux group 
configured for the output transport multiplex. 

2. The sum of the average transcoding bit rate for all 
non-pure-passthrough video channels on any single TPE 
has to be less than an upper bound that is determined 
by the Variable Length Encoder's (380, 440) maximum 
throughput on the TPE. 

3. For a pure passthrough channel, the output bitrate 
should be equal to the input bit rate. A channel may 
be processed as a pure passthrough channel, e.g., to 
preserve its quality. 



4. For any video channel, the output target frame size 
cannot be bigger than the input frame size. This 
translates to the constraint that the average 
transcoding bit rate cannot exceed the average input 
bit rate. 

5. The target frame size cannot be higher than a maximum 
value, nor lower than a minimum value, which are 
provisioned to protect the video buffers. 

The procedure of transcoding bit rate allocation 
is outlined as follows. 

6-2.1. Compute an approximation of the maximum frame 
size 

The maximum transcoded frame size to protect the 
decoder buffer from underflow is given by: 

maxFrameSize = (number of bits transmitted to the 
decoder 452 from the time the first bit of the 
transcoded frame enters the transcoder FIFO 445 to the 
decode time of the frame) - (transcoder FIFO level at 
the time the first bit of the transcoded frame enter 
the FIFO) . 

However, the transcoder FIFO level at the time the 
first bit of the transcoded frame enters the FIFO is 
not known at the time the transcoding bit rate is 
calculated. Therefore, an approximation of the maximum 
transcoded frame size is calculated as follows: 

maxFrameSizeEstimate = delayBitsMax - FifoLevel - 
of f setBitsMax. 
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The value o.f delayBitsMax is the number of bits 
transmitted to the decoder 452 from the last time the 
FIFO level was read to the decode time of the frame, 
and is calculated by: 

delayBitsMax = TqPeriod * sum of transmission bit 
rate values in the transmission bit rate queue for 
Ndelay terms starting from FrameMarker , where: 

Ndelay = Number of Tq slots counting from the time 
when the FifoLevel is read to the time when the frame 
is decoded. 

The value of FifoLevel is the most recent output 
FIFO level of the transcoder. 

The value of of f setBitsMax is the approximate 
number of bits entering the transcoder FIFO from the 
time the FIFO level was last read to the time the first 
bit of the target transcoded frame enters the FIFO. 
This approximation is given by the sum of the initial 
(unbounded) target frame sizes of the frames waiting to 
be transcoded. This is equal to: 

of f setBitsMax = Size of the most recent output 
frame + target frame size of the frame currently being 
transcoded + sum of target frame sizes of the frames 
preceding the current frame that are waiting to be 
transcoded. 

In the approximation of the maximum frame size, it 
is assumed that the number of bits generated by the 
future transcoded frames meets the frame target, and 
the initial frame target values in the QLP' s output 
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queue 132 do not hit the maximum frames size nor the 
minimum frame size. 

6.2.2. Compute an estimate of the minimum frame size 

The minimum transcoded frame size to protect the 
5 decoder 452 from overflow is given by: 

MinFrameSize = (number of bits transmitted to the 
decoder from the time the first bit of the transcoded 
frame enters the transcoder FIFO to the decode time of 
the next frame) - (Size of the decoder's buffer) - 
10 (transcoder FIFO level at the time the first bit of the 

transcoded frame enters the FIFO) . 

It can be show that MinFrameSize is related to 
MaxFrameSize by: 

MinFrameSize = MaxFrameSize + (Number of bits 
15 transmitted to the decoder from the decode time of the 

current frame to the decode time of the next frame) - 
(Size of decoder's buffer). 

Therefore, 

MinFrameSizeEstimate = MaxFrameSizeEstimate + 
20 DeltaBitsMin - DecoderBuf f erSize, 

where DeltaBitsMin = Number of bits transmitted to 
the decoder from the decode time of the current frame 
to the decode time of the next frame, which can be 
calculated by summing the corresponding terms in the 
25 queue of the transmission bit rate. 

In the example implementation, DecoderBuf ferSize 
is the size of the MPEG2 Main Profile, Main Level 
buffer size, which is 1.835Mbits. 
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6.2.3. Compute the maximum transcoding bit rate that 
protects the buffer 

A maximum transcoding bit rate must be set to 
avoid a decoder buffer overflow. The target frame size 
5 of a frame is computed as the input frame size scaled 

by the ratio of the average transcoding bit rate to the 
average input bit rate. Therefore, the maximum 
transcoding bit rate is calculated as follows from the 
maximum frame size, assuming the transcoding bit rate 
10 remains constant until the end of the frame time: 

MaxTcodeBitrate = ( ( MaxFrameSize / OrigFrameSize 
) * AvglnBitrate * FrameTqCount - (Sum of transcoding 
bit rate from the beginning of the frame to the current 
Tq interrupt) * FrameTqlndex ) / (FrameTqCount - 
15 FrameTqlndex) , 

where OrigFrameSize is the number of bits in the 
input frame, FrameTqCount is the number of Tq time 
slots in the frame time, and FrameTqlndex is the number 
of Tq time slots since the start of the frame 

20 (TframeStart) • 

6.2.4. Compute the minimum transcoding bit rate that 
protects the buffer 

A minimum transcoding bit rate must be set to 
avoid a decoder buffer underflow. For each video 
25 service, the minimum transcoding bit rate is computed 

in a manner that is similar to the maximum transcoding 
bit rate: 
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MinTcodeBitrate = ( ( MinFrameSize / OrigFrameSize 
) * AvglnBitrate * FrameTqCount - (Sum of transcoding 
bit rate from the beginning of the frame to the current 
Tq interrupt) * FrameTqlndex ) / (FrameTqCount - 
5 FrameTqlndex) . 

6.2.5. Calculate the maximum aggregated bit rate that 
can be processed by each TPE 

The average output bit rate among all video 
services on any single TPE over a window (e.g., 3 frame 

10 periods) is constrained by the processing power of the 

VLE in the TPE, e.g., the throughput is constrained to 
no more than an average of 12 Mbits/sec. spread (a 
processor-dependent value) over a 3 frame window. At 
any Tq period, the maximum bit rate supported by a TPE 

15 is calculated as follows: 

MaxTpeBitrate = (N Tq * VleThroughput ) - (Sum of 
transcoding bitrate values of every video channel on 
the TPE over the past N Tq -1 Tq 1 interrupts ) , 

where N Tq = number of Tq time slots in the 

20 averaging window, e.g., 3 NTSC frame time (100ms); 

VleThroughput is the throughput of the VLE in terms of 
average bit rate, e.g., 12Mbits/sec. 

6.2.6. Distribute the available bit rate among the 
video channels 

25 The following procedure applies to each stat remux 

group . 

1. The QLP determines the ideal bandwidth allocation in 
absence of minimum and maximum bitrate constraints. 



NominalBitrate = AvailableVideoBitrate / Numbe 
of video channels in the statmux group 

TotalNeed = Sum of NeedParameter over every 
video channel 

if (TotalNeed > 0) 

for (every video channel) 
{ 

NeedBitrate [channel] = 
AvailableVideoBitrate * NeedParameter [channel] / 
TotalNeed 
} 

} else { 

for (every video channel) 

NeedBitrate [channel] = NominalBitrate 

} 

. Each video channel is assigned the MinTcodeBitrate 
the channel. If the sum of MinTcodeBitrate exceeds 
the AvailabeVideoBitrate, the bandwidth is 
distributed in proportion to the MinTcodeBitrate. 

TotalMinBitrate = sum of MinTcodeBitrate over 
the statmux group 

if (TotalMinBitrate > AvailableVideoBitrate) 

{ 

for (every video channel) 
{ 

TcodeTcodeBitrate [channel] = 
MinTcodeTcodeBitrate [channel] * 
AvailableVideoBitrate / TotalMinBitrate 
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} 

AvailableVideoBitrate = 0 

Done with transcoding bit rate allocation. 
} else { 

5 for (every video channel) { 

TcodeBitrate [channel] = MinTcodeBitrate 

[channel] 

NeedBitrate [channel] = Max ( 0, 
NeedBitrate [channel] - MinTcodeBitrate [channel] ) 
10 } 

AvailableVideoBitrate = AvailableVideoBitrate 
- TotalMinBitrate 
} 

3. The QLP then tries to satisfy the user minimum bit 
15 rate requirement. The QLP bounds the user minimum 

bit rate by the MaxTcodeBitrate before applying the 
user minimum bitrate. 

for (every video channel) 
{ 

20 if (UserMinBitrate [channel] > 

MaxTcodeBitrate [c] ) 

minBitrate [channel] = 
MaxTcodeBitrate [channel] 

else 

25 minBitrate [channel] = 

UserMinBitrate [channel] 
} 
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ExtraMinBitrate = Sum of (minBitrate - 
MinTcodeBitrate) over every channel that 
UserMinBitrate is higher than MinTcodeBitrate. 

if (ExtraMinBitrate > AvailableVideoBitrate) 
5 { 

for (every video channel) 

{ 

if (minBitrate [channel] > 
MinTcodeBitrate [channel] ) 
10 { 

extraBitrate = ( 
minBitrate [channel] - MinTcodeBitrate [channel] ) * 
AvailableVideoBitrate / ExtraMinBitrate 

TcodeBitrate [channel] = 
15 TcodeBitrate [channel] + extraBitrate 

needBitrate [channel] = Max (0, 
needBitrate [channel] - extraBitrate) 
} 

} 

20 AvailableVideoBitrate = 0 

} else { 

for (every video channel) 
{ 

if (minBitrate [channel] > 
25 MinTcodeBitrate [channel] ) 

{ 

extraBitrate = minBitrate [channel] 
- MinTcodeBitrate [channel] 



TcodeBitrate [channel] = 
TcodeBitrate [channel] + extraBitrate 

needBitrate [channel] = Max (0, 
needBitrate [channel] - extraBitrate) 

AvailableVideoBitrate = 
AvailableVideoBitrate - extraBitrate 
} 

} 

} 

The QLP calculates the a maximum bit rate value for 
each channel based on the user maximum bit rate, the 
maximum and minimum transcoding bit rates to protect 
the decoder buffer, and the maximum processing bit 
rate that can be supported by each TPE. 

for (every TPE) 

{ 

tpeAvailableBitrate = MaxTpeBitrate [ tpelndex] 
- sum of TcodeBitrate over the TPE 

tpeNeedBitrate = sum of NeedBitrate over the 

TPE 

for (every channel processed by the TPE) 
MaxBitrate [channel] = Min ( 
MaxTcodeBitrate [channel] , 

(tpeAvailableBitrate * NeedBitrate [channel ] / 
tpeNeedBitrate) + TcodeBitrate [channel ] , 

Max ( UserMaxBitrate [channel] , 
MinTcodeBitrate [channel] ) ) 

} 
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5. The QLP assigns the remaining bandwidth in proportion 
to the remaining NeedBitrate values. 

TotalNeedBitrate = sum of needBitrate over 
all video channels 
5 for (every video channel) { 

TcodeBitrate [channel] - TcodeBitrate [channel] 
+ (AvailableVideoBitrate * NeedBitrate [channel ] / 
TotalNeedBitrate) 
} 

10 AvailableVideoBitrate = 0 

6. The QLP applies the maximum bit rate constraint on 
the bit rate allocation. 

for (every video channel) 
{ 

!5 if ( TcodeBitrate [channel] > 

MaxBitrate [channel] ) 
{ 

TcodeBitrate [channel] = 
MaxBitrate [channel] 
20 AvailableVideoBitrate = 

AvailableVideoBitrate + TcodeBitrate [channel] - 
MaxBitrate [channel] 

NeedBitrate [channel] = 0 

} 

25 } 

7. The QLP allocates the extra bandwidth collected from 
the channels that exceed the maximum bit rate. 
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TotalNeedBitrate = Sum of NeedBitrate over every 
channel 

if (AvailableVideoBitrate > 0) 
{ 

5 for (every video channel) 

{ 

extraBitrate = AvailableVideoBitrate * 
NeedBitrate [channel] / TotalNeedBitrate 

if ( extraBitrate + TcodeBitrate [channel] > 
10 MaxBitrate [channel] ) 

extraBitrate = MaxBitrate [channel ] - 
TcodeBitrate [channel] 

TcodeBitrate [channel] = TcodeBitrate [channel] 
+ extraBitrate 
15 AvailableVideoBitrate = 

AvailableVideoBitrate - extraBitrate 
} 

} 

8. The QLP allocates the remaining bandwidth in 
20 proportion to the difference between the current 

allocated bit rate and the maximum bit rate. 

if (AvailableVideoBitrate > 0) 

{ 

TotalHeadroom = sum of 
25 (MaxBitrate [channel] - TcodeBitrate [channel] ) over 

every channel 

for (every channel) 
{ 
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TcodeBitrate [channel] = 
TcodeBitrate [channel] + AvailableVicieoBitrate * ( 
MaxBitrate [channel] - TcodeBitrate [channel] ) / 
TotalHeadroom 
5 } 

} 

The QLP maintains a queue of the transcoding bit 
rate for each video channel. In each Tq interrupt, the 
calculated transcoding bit rate values are stored in 
10 the queues, and retrieved 0.5 seconds later to use as 

transmission bit rate values. 

6.2.7. Initial Target Frame Size Calculation 

At the last Tq slot of a frame, the QLP calculates 
an initial value for the target frame size as follows. 

15 InitialTargetFrameSize = ( OrigFrameSize * 

AvglnBitrate / AvgTcodeBitrate, 

where AvgTcodeBitrate is the average transcoding 
bit rate for the frame, defined as the sum of 
TcodeBitrate over all Tq slots occupied by the frame. 

20 The TPEs may not be ready to transcode a new frame 

at this time, therefore the QLP maintains a target 
frame size queue for each video channel. The 
InitialTargetFrameSize value is stored in the queue for 
the corresponding channel, and is retrieved later when 

25 the TPE is ready to transcode the frame. 

6.3. Passthrough Decision 

At the first Tq interrupt of a frame, the QLP 
decides whether to pass through a frame or not. The 
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pass through decision is made based on the transcoding 
bit rate calculated at the first Tq slot of the frame 
as follows for each channel at the beginning of a new 
frame . 

5 PassThroughBitrate = PassThroughMargin * 

OrigFrameSize / FrameTqCount; 

where PassThroughMargin is a parameter less than 
but close to 1.0, e.g. 0.95; OrigFrameSize is the 
number of bits in the input frame; and FrameTqCount is 
10 the number of Tq slots in the frame. The use of 

PassThroughMargin allows the input frames whose size is 
slightly higher than the target frame size to be passed 
through, thereby preserving the quality of the frame, 
and also saving transcoder processing cycles. 
15 if ( TcodeBitrate > PassThroughBitrate ) 

{ 

Pass through the entire frame. 
} else { 

Transcode the frame. 

20 } 

6.4. Target frame size calculation 

The QLP calculates the maximum frame size and the 
minimum frame size values based on the latest buffer 
level information as soon as it receives a message from 
25 the TPE that signals the TPE is ready to transcode a 

new frame. The QLP then pulls the target frame size 
out from the target frame size queue 132, and computes 
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the final value of the target frame size using maximum 
and minimum frame size constraints. 

6.4.1. Compute the maximum frame size 

The QLP calculates the maximum frame size to 
5 protect the decoder buffer from underflow. The 

calculation is similar to that of the approximate 
maximum frame size calculation during the Tq interrupts 

(6.2.1) : 

MaxFrameSize = DelayBits - FifoLevel - 
10 LastOutputFrameSize . 

The value of DelayBits is the number of bits 
transmitted to the decoder from the time the FIFO level 
was read to the decode time of the frame, and can be 
calculated by summing the corresponding transmission 
15 bit rate values currently in the transmission bit rate 

queue . 

The value of FifoLevel is the transcoder FIFO 
level latched by the transcoder. That is, the 
FifoLevel is read by the transcoder and passed to the 
20 QLP. 

6.4.2. Compute the minimum frame size 

The QLP calculates the minimum frame size to 
protect the decoder buffer from underflow. The 
calculation is similar to that of the approximate 
25 minimum frame size calculation during the Tq interrupts 

(6.2.2) . The minimum frame size is related to the 
maximum frame size by: 



MinFrameSize = MaxFrameSize + (Number of bits 
transmitted to the decoder from decode time of the 
current frame to the decode time of the next frame) - 
(Size of decoder's buffer). 

As mentioned before, for MPEG2 Main Profile Main 
Level, the decoder's buffer size is 1.835 Mbits. 

6.4.3. Compute the carryover from the previous frame 
The transcoders may not be able to generate 

exactly the number of bits equal to the target frame 
size. The surplus or deficit of bits from transcoding 
the previous frame is lumped in with the target frame 
size of the current frame. This deviation (surplus or 
deficit) is calculated as: 

FrameCarryOver = LastOutputFrameSize - 
(TargetFrameSize of previous frame) . 

6.4.4. Compute the Target Frame size 

The QLP pulls the InitialTargetFrameSize value out 
from the target frame size queue of the corresponding 
video channel, and bounds the target frame size by the 
maximum and minimum values: 

TargetFrameSize = Min ( MaxFrameSize, Max ( 
MinFrameSize, InitialTargetFrameSize + FrameCarryOver ) 
) . 

The QLP then sends the values of MinFrameSize, 
MaxFrameSize, and TargetFrameSize to the TPEs. These 
values are used to guide the rate control of the 
transcoding process. 
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6.5. Quantization control 

Within a frame, the following algorithm is used 
for calculating the quantization scale value for every 
macroblock to be transcoded. A new quantization scale 
5 Q New is calculated by scaling the quantization scale of 

the input macroblock Q 01d by a targeted bit reduction 
ratio, RNew/Roid- Typically, each macroblock has a 
quantizer scale. However, a group of macroblocks, such 
as in a slice or other grouping, may be associated with 
10 a common quantizer scale. In this case, a new 

quantization scale is determined for the group. 
Initialization: 

Roid = Original number of bits in the frame. 
RNew = TargetFrameSize = Target number of bits 
15 to be generated by transcoding/requantizing the frame. 

For every slice, do: 
{ 

QNew = Qoid * Roid / ^New 

/* Update R 0 id and R New after requantizing a slice: 

20 */ 

Roid = Roid - original number of bits in the slice. 
Rwew = RNew - new number of bits generated by 
transcoding (e.g., including requantizing) the slice. 
} 

25 For QNew/ rounding to the next higher integer, or 

to the closest integer, may be used. The above formula 
should result in the frame being transcoded to the 
target frame size. The transcoded frame size may go 



over the target, but it should not exceed the maximum 
frame size. 

However, if the maximum frame size is reached very 
early in the frame, which can happen, e.g., if the 
quantization scale at the beginning of the frame is 
low, thereby generating a lot of bits, the quantizer 
scale is set to the maximum level (coarsest quantizing) 
and the rest of the frame will consequently have very 
poor quality. To avoid this, a minimum number of bits 
per macroblock, mb_budget, are allocated. As the frame 
is transcoded, if a running count of the number of bits 
used grows too large, i.e., the number of bits used 
reaches a certain level, which is adjusted as 
requantization of the frame progresses, a panic 
quantizer is set for a short time until there are 
enough bits left for the remaining macroblocks to have 
mb_budget number of bits. That is, the panicJLevel is 
a quantizer level to try to force the MBs to have 
mb__budget or smaller number of bits. This spreads the 
panic quantizer over the frame, such that only a 
portion of the frame may go into the panic mode. To 
achieve this, at the beginning of each frame, 
initialize the following variables: 

mb_budget = T ** etFrameSize .$ 

number _mbs 

panic _ level ~ MaxFrameSize -mb_ budget * number _ mbs 
% determines the minimum number of bits allocated 
to each macroblock as a fraction of the average number 
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of bits per macroblock using the frame target size, T Q . 
The range is 0<£<1. A £ of H-l/2 may be suitable for 
most cases. If £ is too big, the panic condition may 
be triggered too early; if £ is zero, then the panic 
condition may trigger too late, whereby the rest of the 
frame is stuck in panic mode. 

After each macroblock is coded, 

panic _ level = panic _ level - bits _ used _mb + mb_ budget 

if (panic_level < 0) 
Q Me w = MAX_QL; 

where MAX_QL = 112 (e.g., the applicable maximum 
QL for the system) . 

If the frame size is less than the minimum frame 
size, zeros are appended to the end of the bitstream, 
such that the frame size is equal to, or greater than, 
the minimum frame size. 
6.6. PCR slot 

The MPEG standard requires the PCR (Program Clock 
Reference) to be sent at a maximum interval of 100ms. 
The actual PCR value is not known until the 
transmission time, so the transcoder creates a 
placeholder slot for the PCR. 

From the target frame size, the QLP estimates the 
time used for transmitting the frame, hence the minimum 
number of PCRs required to be inserted in the frame to 
satisfy the maximum PCR interval requirement. In 
satisfying this requirement, note that while uncoded 
pictures have constant duration (1/30 sec), coded 
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bitstreams may have a variable duration for each frame. 
For example, if a frame has 100,000 bits and is 
transmitted at 1 Mbps, the duration is 0.1 sec. If the 
frame is transmitted at 2Mbps, then the duration is 
5 0.05 sec. The amount of time required to transmit the 

frame (or, more precisely, the time lapse from the time 
the first bit of the frame leaves the transcoder's 
output buffer (FIFO) 445 to the time the last bit of 
the frame leaves the FIFO) is estimated as: 
10 TxFrameDuration = TargetFrameSize / (minimum value 

in the transmission bit rate queue) . 

The minimum number of PCRs to insert during the 
frame is: 

MinPcrCount = TxFrameDuration / MaxPcrSeparation, 
15 round up to the nearest integer, 

where MaxPcrSeparation is the maximum separation 
between PCRs as required by MPEG (100 ms) . The value 
of MaxPcrSeparation=80 ms is used to provide a 20 ms 
margin . 

20 Accordingly, it can be seen that the present 

invention provides an efficient statistical 
remultiplexer for processing data in a number of 
channels that include video data. In one aspect of the 
invention, transcoding of the video data is delayed 

25 while statistical information is obtained from the 

data. Bit rate need parameters for the data are 
determined based on the statistical information, and 
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the video data is transcoded based on the respective 
bit rate need parameters following the delaying. 

In another aspect of the invention, a transcoding 
bit rate for video frames at the stat remux is updated 
5 a plurality of times at successive intervals to allow a 

closer monitoring of the bit rate. Moreover, minimum 
and maximum bounds for the transcoding bit rate are 
updated in each interval. Thus, a portion of a frame 
is transcoded in a first interval, then the transcoding 

10 bit rate is updated, then a second portion of the frame 

is transcoded in a second interval, and so forth. 

In yet another aspect of the invention, the pre- 
transcoding quantization scales of the macroblocks in a 
frame are scaled to provide corresponding new 

15 quantization scales for transcoding based on a ratio of 

a pre-transcoding amount of data in the frame and a 
target, post-transcoding amount of data for the frame. 
Moreover, the quantization scales are adjusted for 
different portions of the frame as the portions are 

20 transcoded to ensure that a minimum amount of 

transcoding bandwidth is allocated to each macroblock. 

Although the invention has been described in 
connection with various preferred embodiments, it 
should be appreciated that various modifications and 

25 adaptations may be made thereto without departing from 

the scope of the invention as set forth in the claims. 
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What is claimed is: 

1. A method for processing data in a statistical 
remultiplexer that receives a plurality of channels, 
comprising the steps of: 

recovering video frames from the channels; 

delaying transcoding of the video frames while 
obtaining statistical information therefrom; 

determining respective bit rate need parameters 
for the video frames according to the obtained 
statistical information thereof; and 

transcoding the respective video frames in 
accordance with the respective bit rate need parameters 
following the delaying thereof. 

2. The method of claim 1, wherein: 

the delaying of the video frames is achieved by 
storage thereof in a lookahead buffer. 

3. The method of claim 1, comprising the further 
steps of: 

storing the respective bit rate need parameters in 
a storage device; and 

recovering the bit rate need parameters from the 
storage device for the respective video frames for use 
in transcoding thereof. 



4. The method of claim 1, wherein: 
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the statistical information of a respective video 
frame that is used for determining the respective bit 
rate need parameter comprises an average quantizer 
scale value thereof. 

5. The method of claim 1, wherein: 

the statistical information of a respective video 
frame that is used for determining the respective bit 
rate need parameter comprises a number of bits therein. 

6. The method of claim 1, wherein: 

the statistical information of a respective video 
frame that is used for determining the respective bit 
rate need parameter comprises an average bit rate 
associated therewith. 

7. The method of claim 1, wherein: 

the statistical information of a respective video 
frame that is used for determining the respective bit 
rate need parameter comprises a number of bits in 
macroblocks therein. 

8. The method of claim 1, wherein: 

the statistical information of a respective video 
frame that is used for determining the respective bit 
rate need parameter comprises a macroblock resolution 
thereof . 



9. A statistical remultiplexer that receives a 
plurality of channels, comprising: 

means for recovering video frames from the 
channels ; 

means for delaying transcoding of the video frames 
while obtaining statistical information therefrom; 

means for determining respective bit rate need 
parameters for the video frames according to the 
obtained statistical information thereof; and 

means for transcoding the respective video frames 
in accordance with the respective bit rate need 
parameters following the delaying thereof. 

10. A method for processing data in a statistical 
remultiplexer that receives a plurality of channels 
comprising video frames, comprising the steps of: 

updating a transcoding bit rate for at least a 
particular video frame a plurality of times at 
successive intervals as transcoding thereof progresses; 

bounding the updated transcoding bit rates by at 
least one of minimum and maximum levels that are also 
updated in each of the successive intervals to provide 
corresponding bounded and updated transcoding bit 
rates; and 

allocating the bounded and updated transcoding bit 
rates for transcoding corresponding portions of the 
particular video frame in the successive intervals. 
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11. The method of claim 10, wherein: 
the successive intervals are periodic. 

12. The method of claim 10, wherein: 

the updated transcoding bit rates are bounded by 
both minimum and maximum levels that are updated in 
each of the successive intervals to provide the 
corresponding bounded and updated transcoding bit 
rates . 

13. The method of claim 10, comprising the 
further step of: 

computing a target frame size for each video frame 
that indicates an amount of data that is expected to 
result from transcoding thereof; 

wherein video frames whose associated target frame 
size is greater than a number of pre-transcoding bits 
thereof bypass transcoding. 

14. The method of claim 10, comprising the 
further step of: 

computing a target frame size for each video frame 
that indicates an amount of data that is expected to 
result from transcoding thereof; 

wherein video frames whose associated target frame 
size is within a predetermined difference from a number 
of pre-transcoding bits thereof bypass transcoding. 
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15. The method of claim 10, comprising the 
further step of: 

allocating a variable amount of transmission 
bandwidth for passthrough data of the plurality of 
channels; and 

adjusting an amount of transmission bandwidth for 
transcoding the video frames in the plurality of 
channels in accordance with said allocating step. 

16. The method of claim 10, wherein: 

the statistical remultiplexer outputs a transport 
stream comprising a plurality of statistical 
remultiplexing groups of channels; and 

respective portions of a total available 
transmission bandwidth are used to configure respective 
ones of the statistical remultiplexing groups. 

17. The method of claim 10, comprising the 
further steps of: 

delaying the updated and bounded transcoding bit 
rates in the successive intervals according to a delay 
associated with an associated transcoding engine; and 

allocating transmission bit rates in accordance 
with the updated and bounded transcoding bit rates 
after the delaying thereof for transmitting the 
particular video frame after transcoding thereof. 



18. The method of claim 17, comprising the 
further step of: 

providing a number of packets of the particular 
video frame after transcoding thereof to a multiplexer 
for multiplexing with packets of at least one other of 
the channels in accordance with the allocated 
transmission bit rates. 

19. The method of claim 10, comprising the 
further step of: 

computing a target frame size for the particular 
video frame that indicates an amount of data that is 
expected to result from transcoding the particular 
video frame; 

wherein the transcoding bit rate for the 
particular video frame in the successive intervals is 
determined in accordance with the target frame size. 

20. The method of claim 19, wherein: 

the target frame size is bounded by at least one 
of minimum and maximum values that are updated in the 
successive intervals . 

21. The method of claim 20, wherein: 

the at least one of minimum and maximum values are 
determined in accordance with a current fullness of an 
associated transcoding engine buffer. 
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22. The method of claim 20, wherein: 

the at least one of minimum and maximum values are 
determined so as to protect an associated decoder 
buffer from underflow or overflow. 

23. The method of claim 19, comprising the 
further step of: 

estimating a time for transmitting at least one 
packet comprising transcoded data of the particular 
video frame according to the target frame size. 

24. The method of claim 19, comprising the 
further step of: 

estimating a time for inserting clock reference 
data into at least one packet comprising transcoded 
data of the particular video frame according to the 
target frame size. 

25. The method of claim 19, wherein: 

the target frame size is bounded by at least one 
of minimum and maximum values that are updated in the 
successive intervals . 

26. A statistical remultiplexer that receives a 
plurality of channels comprising video frames, 
comprising: 
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means for updating a transcoding bit rate for at 
least a particular video frame a plurality of times at 
successive intervals as transcoding thereof progresses; 

means for bounding the updated transcoding bit 
rates by at least one of minimum and maximum levels 
that are also updated in each of the successive 
intervals to provide corresponding bounded and updated 
transcoding bit rates; and 

means for allocating the bounded and updated 
transcoding bit rates for transcoding corresponding 
portions of the particular video frame in the 
successive intervals . 

27. A method for processing data in a statistical 
remultiplexer that receives a plurality of channels 
comprising video frames, including a particular video 
frame that comprises a plurality of macroblocks in 
associated portions of the particular video frame, each 
macroblock having an associated old quantization scale 
when the particular video frame is input to the 
statistical multiplexer, comprising the steps of: 

computing a target amount of data that is expected 
to result from transcoding the particular video frame;, 
and 

determining new quantization scales for use in 
transcoding corresponding macroblocks in a first 
portion of the particular video frame in accordance 
with (a) the corresponding old quantization scales, and 



a ratio of (b) a pre-transcoding amount of data in the 
particular video frame to (c) the target amount of 
data . 

28. The method of claim 27, comprising the 
further step of: 

rounding the new quantizer scales to an integer 
value . 

29. The method of claim 27, wherein: 

the portions comprise respective slices of the 
particular video frame. 

30. The method of claim 27, wherein: 

the new quantizer scales are adjusted, if 
necessary, to be no finer than the corresponding old 
quantizer scales. 

31. The method of claim 27, wherein: 

the new quantizer scales are adjusted, if 
necessary, to ensure that the corresponding macroblocks 
receive a minimum number of bits for transcoding 
thereof . 

32. The method of claim 27, comprising the 
further step of: 

determining new quantization scales for use in 
transcoding macroblocks in a second portion of the 
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particular video frame in accordance with (a) the 
corresponding old quantization scales, and a ratio of 
(b) a pre-transcoding amount of data in remaining 
portions of the particular video frame to (c) a target 
amount of data that is expected to result from 
transcoding the remaining portions. 

33. The method of claim 32, wherein: 

the target amount of data that is expected to 
result from transcoding the remaining portions is 
determined by decrementing the target amount of data 
that is expected to result from transcoding the 
particular video frame by an amount of data generated 
in transcoding the first portion. 

34. The method of claim 32, wherein: 

the pre-transcoding amount of data in the 
remaining portions is determined by decrementing the 
pre-transcoding amount of data in the particular video 
frame by a pre-transcoding amount of data in the first 
portion . 

35. The method of claim 27, comprising the 
further step of: 

maintaining a running count of an amount of data 
used in transcoding the portions; 

wherein, if the running count exceeds a maximum 
level, a panic quantization scale is set until there 
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are enough bits left for the macroblocks in the 
remaining portions to receive a minimum number of bits 
for transcoding thereof. 

36. The method of claim 35, wherein: 

the maximum level is adjusted downwards after 
transcoding of each portion. 

37. The method of claim 27, wherein: 
each macroblock has its own associated old 

quantization scale . 

38. The method of claim 27, wherein: 

a group of the macroblocks has its own associated 
old quantization scale. 

39. A statistical remultiplexer that receives a 
plurality of channels comprising video frames, 
including a particular video frame that comprises a 
plurality of macroblocks in associated portions of the 
particular video frame, each macroblock having an 
associated old quantization scale when the particular 
video frame is input to the statistical multiplexer, 
comprising: 

means for computing a target amount of data that 
is expected to result from transcoding the particular 
video frame; and 
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means for determining new quantization scales for 
use in transcoding corresponding macroblocks in a first 
portion of the particular video frame in accordance 
with (a) the corresponding old quantization scales, and 
a ratio of (b) a pre-transcoding amount of data in the 
particular video frame to (c) the target amount of 
data . 
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ABSTRACT 

An efficient statistical remultiplexer for 
processing a number of channels that include video 
data. In one aspect, transcoding of the video data is 
5 delayed while statistical information is obtained from 

the data. Bit rate need parameters for the data are 
determined based on the statistical information, and 
the video data is transcoded based on the respective 
bit rate need parameters following the delay. In 

10 another aspect, a transcoding bit rate for video frames 

is updated a plurality of times at successive intervals 
to allow a closer monitoring of the bit rate. Minimum 
and maximum bounds for the transcoding bit rate, e.g., 
for buffer overflow and underflow protection, are also 

15 updated in each interval. In another aspect, the pre- 

transcoding quantization scales of the macroblocks in a 
frame are scaled to provide corresponding new 
quantization scales for transcoding based on a ratio of 
a pre-transcoding amount of data in the frame and a 

20 target, post-transcoding amount of data for the frame. 

The quantization scales are adjusted for different 
portions of the frame as transcoding progresses to 
ensure that a minimum amount of transcoding bandwidth 
is allocated to each macroblock. 
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